library(readr)  # provides read_csv()

urlfile <- "https://raw.githubusercontent.com/gthampak/Arm_Guy_MATH150_Project/main/HELPdata.csv"

HELPdata <- read_csv(url(urlfile))

Introduction

Connecting with medical care is a choice, and many factors can influence it. For non-emergency conditions, a person's perception of their own health shapes whether they seek care. Basic self-diagnosis is possible with the help of the internet, but the chance of misdiagnosis is high because accurate and useful medical information is not always accessible. People who do not consider their health to be in a critical condition may decide that care is not a necessity. Beyond personal health perception, there are also perceptions of the health care system, which could include the ability to pay for treatment. Lastly, poor mental health or substance abuse may leave a person outside the optimal mental space to decide whether or not to connect with medical care.

Many studies and surveys have tried to gauge the accuracy of an individual's perception of their own health and their views on health care. We attempt to widen the scope beyond personal perception of general health and investigate factors that could influence an individual's views on both their own health and primary health care. A better understanding of what influences the decision to connect with medical care can help focus efforts on specific sectors, so that care reaches people who need it but do not know it themselves.

(move from general to specific)

background info (look at source 1)

motivation?

Aim/Hypothesis

Our primary goal was to assess how patients' perceptions of their own health and of health care affect the effectiveness of a novel multidisciplinary clinic for linking patients in a residential detoxification program to primary medical care.

Methods

First, we looked at variables that are related to health and health perception and separated them into different categories.

General/ASI-Composite scores

Questions related to opinion on and habits towards healthcare

SF-36 Scores

Drugs-related variables

Drug and Healthcare related variables

Interviewer's perspective on patient

Demographics and Education-related variables

Model Building

The functions we used for our primary data analyses are coxph(Surv()), to test the significance of the hazard ratio coefficients, and survfit(Surv()), to plot survival curves. First, we put all the variables above into a single model and removed the least significant variable (the one with the highest p-value) one at a time. We also ran models with variables from a single category (above) and removed the least significant variables in the same way. We did this to avoid putting highly correlated variables from the same category into the main model. We also looked for interactions between variables, but none were significant.
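The elimination procedure described above can be sketched as follows. This is a minimal illustration on simulated data, not the code used for the analysis: the data frame `sim`, its columns `x1`, `x2`, `x3`, the helper `backward_cox()`, and the 0.05 cutoff are all assumptions introduced here.

```r
library(survival)

# Simulated stand-in data; column names are hypothetical, not from HELPdata.
set.seed(1)
n <- 300
sim <- data.frame(
  x1 = rbinom(n, 1, 0.5),  # truly affects the hazard
  x2 = rnorm(n),           # noise
  x3 = rbinom(n, 1, 0.5)   # noise
)
event_time  <- rexp(n, rate = 0.1 * exp(1.0 * sim$x1))
censor_time <- rexp(n, rate = 0.05)
sim$time    <- pmin(event_time, censor_time)
sim$status  <- as.integer(event_time <= censor_time)  # 1 = event, 0 = censored

# Backward elimination: refit, drop the term with the largest p-value,
# and stop once every remaining p-value is below alpha.
# Assumes numeric or 0/1 covariates, so coefficient names equal term names.
backward_cox <- function(data, terms, alpha = 0.05) {
  repeat {
    f   <- as.formula(paste("Surv(time, status) ~", paste(terms, collapse = " + ")))
    fit <- coxph(f, data = data)
    cf  <- summary(fit)$coefficients
    p   <- setNames(cf[, "Pr(>|z|)"], rownames(cf))
    if (max(p) < alpha || length(terms) == 1) return(fit)
    terms <- setdiff(terms, names(p)[which.max(p)])
  }
}

fit <- backward_cox(sim, c("x1", "x2", "x3"))
names(coef(fit))  # variables that survived elimination
```

Removing one variable per refit matters because p-values change whenever a correlated variable leaves the model.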

(make sure replicable)

Explanation of variables we explored, why we explored them (in relation to our main hypothesis/primary aim)

Categorize variables explored (sf scores??)

Results (Model goes here!)

Exploratory Data Analyses

First, we wanted to see whether a higher pain score translates to a lower self-rating of health. We hypothesized that it does.

From this visualization, we see that as the pain score increases, the number of people who rated their health as excellent also increases, which makes sense.

Next, we were interested in the correlation between age and perception of future health. We hypothesized that older people are more likely to believe that their health will get worse. This is mostly true: the percentage of people who believe their health will get worse increases with age, as shown below.

Next, we looked for patterns between perception of mental health and general health. This visualisation is an effort to answer the question: do individuals take their mental health into account when they rate their general health? It seems that the majority of people who think their general health is bad also suffer from poor mental health, which suggests that mental health is considered in ratings of overall general health.

library(plotly)  # provides plot_ly(), layout(), and the %>% pipe

# Cross-tabulate "I felt depressed" (f1f, rows) against "My health is excellent" (b11d, columns)
mental <- matrix(table(HELPdata$f1f, HELPdata$b11d), ncol = 5, nrow = 4)

fig1 <- plot_ly(
  x = c("Definitely true", "Mostly true", "Don't know", "Mostly false", "Definitely false"),
  y = c("Rarely/never", "Some of the time", "Occas/moderately", "Most of the time"),
  z = mental, type = "heatmap")

fig1 <- fig1 %>% layout(
  title = "Relationship between personal mental health perception and general health perception",
  xaxis = list(title = "My health is excellent"),
  yaxis = list(title = "I felt depressed"))

fig1

Now, we investigate variables associated with education, since education may affect an individual's perception of health care, their knowledge of the health care system, and their ability to self-diagnose. We plotted a quick histogram to check whether years of education (a9) corresponds with the high school variable (hs_grad), and it does: high school finishes after 12 years of formal education, and that is the case here.

Here, we note that a lot of the patients in the study are high school graduates.

T-tests

With that, we wanted to know whether high school graduates and non-graduates view the importance of medical treatment differently. We performed a t-test of d5_rec, which asks patients whether medical treatment is important (0=No, 1=Yes), grouped by hs_grad. We get a p-value of 0.7741 and cannot reject the null hypothesis that the proportion of patients who think medical treatment is important is the same for high school graduates and non-graduates.

t.test(d5_rec~hs_grad, data = HELPdata)
## 
##  Welch Two Sample t-test
## 
## data:  d5_rec by hs_grad
## t = 0.28744, df = 198.75, p-value = 0.7741
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.09854964  0.13218183
## sample estimates:
## mean in group 0 mean in group 1 
##       0.5523810       0.5355649

Another factor we thought could potentially affect patients' views on the importance of medical treatment is substance use. We chose a t-test to see whether the proportion of people who view medical treatment as important differs between patients who have alcohol as their primary substance and those who do not. The p-value for this t-test is 0.03, which is significant at the 0.05 level, so we reject the null hypothesis. The test shows that patients with alcohol as their primary substance are more likely to view medical treatment as important.

t.test(d5_rec~alcohol, data = HELPdata)
## 
##  Welch Two Sample t-test
## 
## data:  d5_rec by alcohol
## t = -2.1824, df = 254.19, p-value = 0.02999
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.23130020 -0.01187097
## sample estimates:
## mean in group 0 mean in group 1 
##       0.4640000       0.5855856
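Because d5_rec is coded 0/1, the group means reported by t.test() are proportions, so a two-sample test of proportions is a natural cross-check. The sketch below uses prop.test() with counts chosen only because they reproduce the group means reported for hs_grad (58/105 and 128/239). The actual group sizes do not appear in the output, so these counts are an assumption.

```r
# Assumed counts: they match the reported means 0.5523810 and 0.5355649,
# but the real group sizes are not shown in the t-test output.
yes <- c(58, 128)    # patients answering "yes" (1) in hs_grad groups 0 and 1
n   <- c(105, 239)   # assumed number of patients in each group

# Two-sample test of equal proportions (chi-squared, with continuity correction)
res <- prop.test(x = yes, n = n)

res$estimate  # sample proportions in each group
res$p.value   # compare with the Welch t-test p-value
```

With groups of this size the two approaches should lead to the same conclusion; a large disagreement would be a reason to recheck the coding of the variables.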

Significant variables after preliminary models:
* group
* alcohol
* coc_her
* hs_grad
* a9 (education)
* pf
* any_util

Survival Analysis Model

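library(survival)  # provides Surv(), survfit(), and coxph()
library(broom)     # provides tidy()

# Survival curves, one per observed combination of the covariates
HELPdata_survfit <- survfit(Surv(dayslink, linkstatus) ~ group + alcohol + hs_grad + any_util + age, data=HELPdata)

#ggsurvplot(HELPdata_survfit, conf.int = TRUE)  # requires the survminer package

# Cox proportional hazards model; tidy() returns the coefficient table
coxph(Surv(dayslink, linkstatus) ~ group + alcohol + hs_grad + any_util + age, data=HELPdata) %>% tidy()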

Our final model includes the following variables:
* group
* alcohol
* hs_grad
* any_util
* age

From this coxph model, group's coefficient estimate is 1.73914222, which means that patients in treatment group 1 are \(e^{1.73914222} = 5.692458\) times as likely to receive primary care at any given time as patients in treatment group 0. The p-value for this coefficient is 1.621803e-14, so we reject the null hypothesis that treatment group has no effect on the 'risk' of seeking primary health care (with all else constant).

alcohol's coefficient estimate is 0.49420991, which means that patients with alcohol as their primary substance are \(e^{0.49420991} = 1.639203\) times as likely to receive primary care at any given time as patients who do not have alcohol as their primary substance. The p-value for this coefficient is 1.527822e-02, so we reject the null hypothesis that alcohol as primary substance does not correlate with the 'risk' of seeking primary health care (with all else constant).

hs_grad's coefficient estimate is -0.54287790, which means that patients who are high school graduates are \(e^{-0.54287790} = 0.5810736\) times as likely to receive primary care at any given time as patients who are not high school graduates. The p-value for this coefficient is 3.189917e-03, so we reject the null hypothesis that high school graduation does not correlate with the 'risk' of seeking primary health care (with all else constant).

any_util's coefficient estimate is -0.40873422, which means that patients with recent health utilization are \(e^{-0.40873422} = 0.6644908\) times as likely to receive primary care at any given time as patients who have no recent health utilization. The p-value for this coefficient is 4.755411e-02, so we reject the null hypothesis that recent health utilization does not correlate with the 'risk' of seeking primary health care (with all else constant).

age's coefficient estimate is 0.02386218, which means that a one-year increase in age multiplies the 'risk' of linking with primary care at any given time by \(e^{0.02386218} = 1.024149\). The p-value for this coefficient is 4.548318e-02, so we reject the null hypothesis that age does not correlate with the 'risk' of seeking primary health care (with all else constant).
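The hazard ratios quoted above are the exponentiated coefficients, and they can be recomputed directly. The sketch below copies the estimates from the text rather than refitting the model. The 10-year age comparison is an added illustration, not a result reported in this analysis.

```r
# Coefficient estimates copied from the coxph output discussed above
beta <- c(group    =  1.73914222,
          alcohol  =  0.49420991,
          hs_grad  = -0.54287790,
          any_util = -0.40873422,
          age      =  0.02386218)

# Hazard ratio for a one-unit increase in each variable, all else constant
hr <- exp(beta)
round(hr, 4)

# Percent change in the hazard (negative values mean a lower hazard)
pct_change <- 100 * (hr - 1)
round(pct_change, 1)

# For a continuous variable the effect compounds: a 10-year age difference
# multiplies the hazard by exp(10 * beta_age), not by 10 * exp(beta_age)
hr_age10 <- exp(10 * beta[["age"]])
hr_age10
```

Working from the log-hazard scale makes it clear why hs_grad and any_util, which have negative coefficients, correspond to hazard ratios below 1.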

Discussion

Do the results answer the question?

How did we deviate from the question in the process? Anything interesting?

Be clear about why we reject or fail to reject null hypotheses

Relate work to previous research

Interesting points (potentially):
* Variables that are significant tend to be binary variables with an even distribution of yes/no answers (or one that is lopsided towards the yes (1) response). We suspect this is because more observations in each group translate directly into higher power, so a smaller absolute difference can produce a lower (and potentially significant) p-value.
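The power argument above can be made concrete with the standard error of a difference in proportions, which shrinks as the smaller group grows. The total sample size, group splits, and common proportion below are hypothetical and chosen only for illustration.

```r
# Standard error of the difference between two sample proportions,
# assuming both groups share the same underlying proportion p
se_diff <- function(p, n1, n2) sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

p <- 0.5  # hypothetical common proportion

# Same total of 400 observations, split two different ways (hypothetical)
se_balanced <- se_diff(p, 200, 200)
se_lopsided <- se_diff(p, 40, 360)

# A smaller standard error means a smaller observed difference
# is enough to reach a given significance level
c(balanced = se_balanced, lopsided = se_lopsided)
se_lopsided / se_balanced
```

Under these assumptions the balanced split gives the smaller standard error, which is consistent with the suspicion that evenly split binary variables are easier to detect as significant.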

Sources/References

Perception of Health and Use of Health Care Services in a Swedish Primary Care District. A ten Year’s Perspective https://www.tandfonline.com/doi/pdf/10.3109/02813439109026592